# High-precision Action Recognition

Xclip Large Patch14 16 Frames

X-CLIP is an extension of CLIP for general video-language understanding. It is trained via contrastive learning on (video, text) pairs and is suitable for tasks such as video classification and video-text retrieval.

Transformers English
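
To make the contrastive classification idea concrete, here is a minimal sketch of how a video embedding can be scored against text-label embeddings. It is not X-CLIP's actual implementation: the three-dimensional embeddings are made-up toy values, and the function names and the `temperature` value are hypothetical choices for illustration. A real model would produce the embeddings with its video and text encoders.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def classify(video_emb, label_embs, temperature=0.07):
    """Turn video-to-label similarities into probabilities with a softmax.

    `temperature` is an assumed illustrative value, not one taken from X-CLIP.
    """
    labels = list(label_embs)
    logits = [cosine(video_emb, label_embs[name]) / temperature for name in labels]
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return {name: e / total for name, e in zip(labels, exps)}


# Toy embeddings (assumed values standing in for encoder outputs).
video_embedding = [0.9, 0.1, 0.0]
label_embeddings = {
    "playing guitar": [1.0, 0.0, 0.0],
    "swimming": [0.0, 1.0, 0.0],
    "cooking": [0.0, 0.0, 1.0],
}

probs = classify(video_embedding, label_embeddings)
best_label = max(probs, key=probs.get)
print(best_label)
```

The same similarity scores can be read in the other direction for video-text retrieval: rank a set of videos by their similarity to one text query instead of ranking labels for one video.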

Xclip Large Patch14

X-CLIP is an extension of CLIP for general video-language understanding, trained via contrastive learning on (video, text) pairs.

Transformers English

Xclip Base Patch32 16 Frames

X-CLIP is an extended version of CLIP for general video-language understanding, trained on video-text pairs via contrastive learning, suitable for tasks like video classification and video-text retrieval.

Transformers English
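
The "16 Frames" in the model names suggests that those checkpoints take a fixed number of frames per clip. The sketch below shows one common way to pick that many frame indices from a longer video, by taking the centre of equally sized segments. The sampling strategy and the helper name `sample_frame_indices` are assumptions for illustration; the listing does not say how frames were sampled for these models.

```python
def sample_frame_indices(total_frames, num_frames=16):
    """Pick `num_frames` indices spread evenly over a clip.

    Splits the clip into `num_frames` equal segments and takes the frame at
    the centre of each one. Clips shorter than `num_frames` repeat frames.
    """
    if total_frames <= 0 or num_frames <= 0:
        raise ValueError("total_frames and num_frames must be positive")
    segment = total_frames / num_frames
    return [
        min(total_frames - 1, int((i + 0.5) * segment))
        for i in range(num_frames)
    ]


# A 160-frame clip yields one index from the middle of every 10-frame segment.
indices = sample_frame_indices(160)
print(indices)

# A short 8-frame clip still yields 16 indices, with each frame used twice.
short_indices = sample_frame_indices(8)
print(short_indices)
```

The selected frames would then be decoded and passed to the model's preprocessing step; that step is omitted here because it depends on the video library in use.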
